Improving Cancer Gene Expression Data Quality through a TCGA Data-Driven Evaluation of Identifier Filtering

Authors

  • Kevin K. McDade
  • Uma Chandran
  • Roger S. Day
Abstract

Data quality is a recognized problem for high-throughput genomics platforms, as evinced by the proliferation of methods attempting to filter out lower quality data points. Different filtering methods lead to discordant results, raising the question: which methods are best? Surprisingly little computational support is offered to help analysts decide which filtering methods are optimal for the research question at hand. To evaluate them, we begin with a pair of expression data sets, transcriptomic and proteomic, on the same samples. The pair of data sets forms a test-bed for the evaluation. Identifier mapping between the data sets creates a collection of feature pairs, with a correlation calculated for each pair. To evaluate a filtering strategy, we estimate posterior probabilities for the correctness of the probesets accepted by the method. An analyst can set expected utilities that represent the trade-off between the quality and quantity of accepted features. We tested nine published probeset filtering methods and combination strategies. We used two test-beds from cancer studies providing transcriptomic and proteomic data. For reasonable utility settings, the Jetset filtering method was optimal for probeset filtering on both test-beds, even though the assay platforms differed between the two test-beds. Further intersection with a second filtering method was indicated on one test-bed but not the other.
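The evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the utility values (`u_true`, `u_false`), the filter names, and the example posterior probabilities are all hypothetical, and the step that turns per-pair correlations into posterior probabilities is assumed to have been done elsewhere. The sketch only shows a per-pair Pearson correlation and how an analyst-chosen utility trade-off could rank filtering strategies.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation for one feature pair measured on the same samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def expected_utility(posteriors, accepted, u_true=1.0, u_false=-2.0):
    """Sum expected utility over the features a filter accepts.

    posteriors: dict mapping feature id -> P(feature is correct),
                assumed to be estimated beforehand (not shown here).
    u_true / u_false: hypothetical utilities for accepting a correct
                      or an incorrect feature.
    """
    return sum(
        posteriors[f] * u_true + (1.0 - posteriors[f]) * u_false
        for f in accepted
    )


def best_filter(posteriors, filters, u_true=1.0, u_false=-2.0):
    """Return the name of the filter with the highest expected utility, plus all scores."""
    scores = {
        name: expected_utility(posteriors, accepted, u_true, u_false)
        for name, accepted in filters.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical posterior probabilities for four probesets.
posteriors = {"ps1": 0.9, "ps2": 0.8, "ps3": 0.3, "ps4": 0.5}

# Hypothetical filtering strategies, each described by the set it accepts.
filters = {
    "no_filter": {"ps1", "ps2", "ps3", "ps4"},
    "filter_A": {"ps1", "ps2"},
    "filter_B": {"ps1", "ps2", "ps4"},
}

winner, scores = best_filter(posteriors, filters)
```

With these made-up numbers, the stricter `filter_A` wins because the penalty for accepting a likely-incorrect probeset outweighs the gain from keeping more features. Raising `u_false` toward zero shifts the balance toward quantity.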


Journal title:

Volume 14, Issue

Pages -

Publication date: 2015